Report with R Markdown - Theory
R Markdown is an authoring framework for data science (AFDS) combining R code with narrative text. AFDS are used to create data-driven documents (reports) with a notebook interface.
| AFDS | Language | Description |
|---|---|---|
| R Markdown | R | front-end oriented |
| Jupyter | Python | back-end oriented |
| D3.js | JavaScript | web oriented |
| Plotly | JavaScript | web oriented |
R Markdown is often enough for academic writing. It uses Markdown syntax and supports HTML/CSS markup, BibTeX references, 3D plots and interactive maps. The main R Markdown packages are:
- {knitr}: engine for dynamic report generation with R (e.g. convert to LaTeX)
- {rmarkdown}: converts R Markdown documents into a variety of formats, extension of {knitr}
- {tinytex}: lightweight LaTeX distribution used for typesetting (e.g. export in PDF)
- {bookdown}: advanced editing and publishing functions (e.g. cross-reference figures)
R Markdown combines:
- a YAML header for the document metadata
- different code chunks (R code embedded)
- many narrative parts (Markdown syntax) with inline R code
| Structure of an R Markdown document |
|---|
| YAML header |
| narrative part |
| R code |
| narrative part |
| R code |
| … |
Narrative parts mainly use the Markdown syntax, but they also support HTML/CSS code, LaTeX syntax, etc.
Markdown is an easy-to-write plain text syntax used by different code-oriented frameworks:
Markdown basic syntax and Markdown online editor
Pandoc is a document converter that understands an extended Markdown syntax and is really useful for text conversions (Markdown to HTML, LaTeX to Markdown, etc.)
| code in R Markdown | Render |
|---|---|
| `**bold**`, `__bold__` | **bold**, **bold** |
| `*italic*`, `_italic_` | *italic*, *italic* |
| `` `code` `` | `code` |
| `hyphen -- inserted -- in a sentence` | hyphen – inserted – in a sentence |
| `H~2~O` | H₂O |
| `10^−19^` | 10⁻¹⁹ |
| `$$\sum_{i=1}^{n} X^3_i$$` | \[\sum_{i=1}^{n} X^3_i\] |
| … | … |
>> I am a blockquote
I am a blockquote
--- or ***
Much of the text styling can also be done with HTML/CSS
- numbered
1. first element
2. second element
3. third element
numbered
- bullet
* first element
* second element
* third element
    - third element - sub 1
    - third element - sub 2
        + third element - sub 2 - subsub 1
        + third element - sub 2 - subsub 2
bullet
| Syntax | Description |
| --- | ----------- |
| Header | Title |
| Paragraph | Text |
and
| Syntax | Description |
| ----------- | ----------- |
| Header | Title |
| Paragraph | Text |
produce the same result:
| Syntax | Description |
|---|---|
| Header | Title |
| Paragraph | Text |
| Left | Center | Right |
| :--- | :----: | ---: |
| Header | Title | Here's this |
| Paragraph | Text | And more |
| Left | Center | Right |
|---|---|---|
| Header | Title | Here’s this |
| Paragraph | Text | And more |
Tables are often difficult to lay out. For complex tables you will need to use code chunks, e.g. kable() from {knitr} with kable_styling() from {kableExtra}
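As a minimal sketch of the code-chunk approach, the block below rebuilds the small alignment table shown above with kable(). The data frame `tab` is made up for illustration, and `format = "pipe"` assumes a reasonably recent version of {knitr}; kable_styling() from {kableExtra} could then be applied to an HTML or LaTeX table, but it is left out here.

```r
library(knitr)

# Hypothetical data frame reproducing the small alignment table
tab <- data.frame(
  Left   = c("Header", "Paragraph"),
  Center = c("Title", "Text"),
  Right  = c("Here's this", "And more")
)

# align takes one letter per column: l(eft), c(enter), r(ight)
tbl <- kable(tab, format = "pipe", align = c("l", "c", "r"))
print(tbl)
```

Inside an R Markdown document, the same call placed in a code chunk prints the table directly in the rendered output.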
These two syntaxes give the same result (Markdown and HTML):
![](https://raw.githubusercontent.com/zoometh/oxford/main/R4A/www/logo.png){width=100px}
<img src="https://raw.githubusercontent.com/zoometh/oxford/main/R4A/www/logo.png" alt="" width=100>
![](www/logo.png){width=100px}
<img src="www/logo.png" alt="" width=100>
Extra spaces (HTML only)
- `&emsp;` means 4 spaces (= a tabulation)
- `&ensp;` means 2 spaces
- `&nbsp;` means 1 space
End of line (`<br>` in HTML) with 2 or more spaces and return, for example:
'Reconnais-toi
Cette adorable personne c'est toi
Sous le grand chapeau canotier
Oeil
Nez
Ta Bouche
Voici l’ovale de ta figure
Ton cou Exquis' (Apollinaire, 1913)
‘Reconnais-toi
Cette adorable personne c’est toi
Sous le grand chapeau canotier
Oeil
Nez
Ta Bouche
Voici l’ovale de ta figure
Ton cou Exquis’ (Apollinaire, 1913)
# Heading level 1
## Heading level 2
### Heading level 3
etc.
An anchor:
# Practice {#practice}
Avoid numbering:
#### Spaces and end of lines {-}
Avoid numbering and add an anchor:
#### Bookmarks {-#bookmarks}
Report with R Markdown, part 2: [Practice](https://zoometh.github.io/oxford/R4A/2_R Markdown_Practice)
Report with R Markdown, part 2: Practice
[{width=70px}](https://www.unipi.it/index.php/humanities/item/16574-r4rchaeologists)
Hyperlinks can also be done with HTML/CSS
After this theoretical part, you will have to [practice](#practice)
After this theoretical part, you will have to practice
The reference section, ‘Practice’, appears like that:
Footnote to the bottom of the document
A simple footnote,[^1] or a longer one.[^bignote]
A simple footnote,1 or a longer one.2
[^1]: This is the first footnote.
[^bignote]: Here's one with multiple paragraphs and code.
Indent paragraphs to include them in the footnote.
`{ my code }`
Add as many paragraphs as you like.
Cite and reuse variables, figures, tables, sections, etc., through your document
In the following verbatim blocks we use single quotes (') instead of backticks (`) so that the inline R code is displayed rather than evaluated
library(archdata)
data("Handaxes")
number.of.axes <- nrow(Handaxes)
Variables are called like this: ‘r variable_name’
'(...) the Furze Platt dataset counts 'r number.of.axes' handaxes described by 'r ncol(Handaxes)' variables. The maximal length (L = 'r max(Handaxes$L)') (...)'
‘(…) the Furze Platt dataset counts 600 handaxes described by 8 variables. The maximal length (L = 242) (…)’
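To make the mechanism concrete, the sketch below computes in plain R the values that inline code would insert into such a sentence. The data frame `axes` and its two columns are invented stand-ins, not the Handaxes data.

```r
# Hypothetical stand-in for a dataset (3 objects, 2 variables)
axes <- data.frame(L = c(120, 242, 98), B = c(60, 110, 51))

# Values that an inline expression such as `r nrow(axes)` would return
n.obj <- nrow(axes)
n.var <- ncol(axes)
max.L <- max(axes$L)

# The sentence as it would read once knitted
sentence <- paste0("(...) the dataset counts ", n.obj, " objects described by ",
                   n.var, " variables. The maximal length (L = ", max.L, ") (...)")
print(sentence)
```

Because the values are recomputed at each knit, the sentence stays in sync with the data if the dataset changes.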
library(archdata)
data("Handaxes")
plot(Handaxes$L, Handaxes$B)
model <- lm(B ~ L, data = Handaxes)
abline(model)
Figure 2.1: Maximum Length/Maximum breadth in cm
'(...) the regression of the maximum breadth (B) on the maximum length (L) has a slope of 'r round(model$coefficients[2], 2)' (Fig. \@ref(fig:maxLmaxB)) (...)'
‘(…) the regression of the maximum breadth (B) on the maximum length (L) has a slope of 0.42 (Fig. 2.1) (…)’
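Note that the second coefficient of an lm() fit is the slope of the regression line, whereas the coefficient of determination (R²) is stored in the model summary. A minimal sketch on simulated data (the vectors `len` and `bre` are made up, not the Handaxes measurements):

```r
set.seed(1)
len <- runif(50, 80, 240)                # hypothetical lengths
bre <- 0.4 * len + rnorm(50, sd = 10)    # hypothetical breadths

model <- lm(bre ~ len)

slope <- unname(coef(model)[2])          # slope of the fitted line
r2    <- summary(model)$r.squared        # coefficient of determination

print(round(c(slope = slope, r2 = r2), 2))
```

Either value can then be reported inline, for example by calling round(r2, 2) in an inline R expression.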
See section [**Bookmarks**](#bookmarks)
See section Bookmarks
.bib referenced in the YAML header

| code in R Markdown | Render |
|---|---|
| `@Xie22` | Xie (2022) |
| `[@Xie22]` | (Xie 2022) |
| `[credits: @Xie22]` | (credits: Xie 2022) |
| `published by Yihui Xie [-@Xie20; -@Xie22]` | published by Yihui Xie (2020; 2022) |
| … | … |
Bibliographies and citations (Xie, Dervieux, and Riederer 2020)
Code chunks, or chunks, are the placeholders for the coding part of the document
The chunk header is used to set the output options (show code, size of the output image, alignment, etc.). The first chunk of the document can set these options for all the other chunks, e.g. knitr::opts_chunk$set(echo = TRUE) will echo the code of all chunks unless you change this option in the headers of the following chunks
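A setup chunk along the following lines is a common way to do this. The particular option values below are illustrative assumptions, not the ones used in this document.

```r
library(knitr)

# Global defaults applied to every following chunk
opts_chunk$set(
  echo = TRUE,          # show the code
  message = FALSE,      # hide messages
  warning = FALSE,      # hide warnings
  fig.align = "center"  # center the figures
)

# Check the current default; a later chunk header can still override it
print(opts_chunk$get("echo"))
```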
graphical interface for options
run the previous chunks but not this one
run this chunk
Code evaluation
include = FALSE prevents code and results from appearing in the finished file. R Markdown still runs the code in the chunk, and the results can be used by other chunks.
echo = FALSE prevents code, but not the results from appearing in the finished file. This is a useful way to embed figures.
message = FALSE prevents messages that are generated by code from appearing in the finished file.
warning = FALSE prevents warnings that are generated by code from appearing in the finished file.
Options of the graphical results
fig.cap = "..." adds a caption to graphical results.
fig.height = 7 height to use in R for plots created by the chunk (in inches)
fig.width = 7 width to use in R for plots created by the chunk (in inches)
fig.align = 'default' how to align graphics in the final document. One of ‘default’, ‘left’, ‘right’, or ‘center’
fig.pos = 'H' places the figure exactly here (LaTeX/PDF output)
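For illustration, recent versions of {knitr} (1.35 or later, as far as we know) also accept chunk options written as `#|` comments at the top of the chunk body, which keeps long headers readable. This is only a sketch: the built-in `cars` dataset and the caption are placeholders.

```r
#| echo = FALSE, fig.cap = "Speed and stopping distance",
#| fig.height = 5, fig.width = 7, fig.align = "center"

# Body of the chunk: outside of knitr, the lines above are ordinary comments
plot(cars$speed, cars$dist, xlab = "Speed", ylab = "Distance")
```

In an .Rmd file these lines go inside an R code chunk; the same options could equally be written in the chunk header itself.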
The body of a code chunk is
Images can be rendered with knitr::include_graphics("path/to/image")
knitr::include_graphics("www/munsingen_fib_measures.png")
Figure 2.2: Fibulae measurements (Hodson, 1970)
This is the document header. It contains the metadata (e.g. title, authors, date) and the document configuration (e.g. HTML or PDF rendering, table of contents). It is composed of key-value pairs:
title: Title
author: Author
date:
"03/02/2022"
“03/02/2022”
"'r format(Sys.time(), '%D')'"
“02/05/23”
"'r format(Sys.time(), '%d %B %Y')'"
“05 February 2023”
…
output:
html_document
pdf_document
…
pdf_document:
toc: yes
toc: table of contents
toc: yes
toc_depth: 4
toc_float:
collapsed: no
…
bibliography: bibliographical references, BibTex format (e.g. https://github.com/zoometh/oxford/blob/main/R4A/references.bib)
…
YAML for R Markdown (Xie, Allaire, and Grolemund 2018)
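The date formats used in the header can be checked directly in R. The sketch below uses a fixed date instead of Sys.time() so that the result is reproducible; the variable names are arbitrary, and month names depend on the session locale.

```r
# Fixed date standing in for Sys.time()
d <- as.Date("2023-02-05")

us.short <- format(d, "%D")         # month/day/two-digit year
long     <- format(d, "%d %B %Y")   # day, full month name, year
dmy      <- format(d, "%d/%m/%Y")   # day/month/four-digit year

print(c(us.short, long, dmy))
```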
HTML can be used in all parts of the document (YAML header, code chunks, narrative parts)
| code in R Markdown (= HTML) | Render |
|---|---|
| `<span style='font-size: 30px'>Big font</span>` | Big font |
| `<b>bolded</b>` | **bolded** |
| `<span style="color:red">red</span>` | red |
| … | … |
Customize the document with CSS layouts like <notes>this CSS element with a dodgerblue for background and white for text</notes> here: https://github.com/zoometh/oxford/blob/main/R4A/styles.css
Customize the document with CSS layouts like
The CSS file is here: https://github.com/zoometh/oxford/blob/main/R4A/styles.css
The main advantage of HTML is that it can be deployed online with interactive content. R offers a dedicated framework for interactive documents, Shiny, which can be integrated into R Markdown
Try: File > New File > R Markdown > Shiny, or in the YAML header:
runtime: shiny
With the {leaflet} package
library(dplyr)
library(leaflet)
munsingen.long <- 7.569587484129203
munsingen.lat <- 46.864709895956004
leaflet(width = "60%", height = "400px") %>%
addTiles(group = 'OSM') %>%
addControl("Munsingen necropolis", position = "bottomright") %>%
addProviderTiles(providers$Esri.WorldImagery, group='Esri.WorldImagery') %>%
addMarkers(munsingen.long,
munsingen.lat,
label = "Munsingen necropolis") %>%
addLayersControl(
baseGroups = c('OSM', 'Esri.WorldImagery')) %>%
addScaleBar(position = "bottomleft")
With the {plotly} package
library(plotly)
library(dplyr)
library(archdata)
data("Fibulae")
Fibulae.ex <- Fibulae
Fibulae.ex$lbl <- paste0("<b>Museum num.: ", Fibulae.ex$Mno, "</b><br>",
"Length: ", Fibulae.ex$Length, "<br>",
"Foot Angle: ", Fibulae.ex$FA, "<br>")
plot_ly(data = Fibulae.ex,
x = ~Length,
y = ~FA,
text = ~lbl,
hoverinfo = "text") %>%
layout(title = "Munsingen fibulae")
With the {rgl} package
library(rgl)
# assumes {dplyr} and the Fibulae data are already loaded (see the {plotly} example)
options(rgl.useNULL = TRUE) # avoid the popup RGL device
nb.samp <- 12
# the 12 graves with the most fibulae
Fibulae.nbGrave <- Fibulae %>%
  count(Grave) %>%
  arrange(-n) %>%
  slice_head(n = nb.samp)
Fibulae.samp <- Fibulae[Fibulae$Grave %in% Fibulae.nbGrave$Grave, ]
# rainbow colors by graves
Fibulae.samp$color <- rainbow(nb.samp)[as.numeric(as.factor(Fibulae.samp$Grave))]
# plot the sampled fibulae only, so that coordinates and colors have the same length
plot3d(
  x = Fibulae.samp$Length,
  y = Fibulae.samp$FA,
  z = Fibulae.samp$BH,
  col = Fibulae.samp$color,
  type = 's',
  xlab = "Length",
  ylab = "Foot Angle",
  zlab = "Bow Height")
rglwidget()
Export R Markdown in: HTML, PDF, LaTeX, Word, ODT, RTF, Markdown, etc. (see: https://rmarkdown.rstudio.com/lesson-9.html)
Basic. HTML export is the default option of an R Markdown document.
Harder. PDF export relies on LaTeX. If you don’t have a LaTeX distribution such as MiKTeX installed, you need to install the {tinytex} package [Xie19].
install.packages('tinytex')
tinytex::install_tinytex()
# .rs.restartR()
To export in PDF, change the YAML key-value pair bookdown::html_document2: to bookdown::pdf_document2: and knit. These temporary files are created:
At the end of the render these files are deleted
LaTeX (an extension of TeX) is a rich plain text syntax for academic writing. To keep the intermediate .tex file, select a PDF export in the YAML header and add keep_tex:
bookdown::pdf_document2:
  keep_tex: true
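Knitting can also be scripted rather than triggered from the RStudio button. The following sketch writes a tiny throwaway .Rmd file to a temporary folder and, only if {rmarkdown} and Pandoc are available, renders it to HTML. The file name and its content are placeholders.

```r
# Throwaway R Markdown document (hypothetical content)
rmd <- file.path(tempdir(), "demo.Rmd")
writeLines(c(
  "---",
  "title: Demo",
  "output: html_document",
  "---",
  "",
  "Two plus two is `r 2 + 2`."
), rmd)

# Render only when the tools are installed
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out <- rmarkdown::render(rmd, output_format = "html_document", quiet = TRUE)
  print(out)  # path of the HTML file
}
```

For a PDF, passing output_format = "pdf_document" would be the analogous call, provided a LaTeX distribution is installed.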
Alongside Pandoc, many online apps make conversions easier (e.g. Word to HTML)
1) Use 1_Rmarkdown_Theory.Rmd as a model (copy/paste code snippets, Markdown syntax, etc.):
2) 2_Rmarkdown_Practice.Rmd gives an example of the desired structure for the Practice part:
3) references.bib is the document where you will add the bibliographic references of the practice part. Google Scholar is a nice tool to copy/paste BibTeX references, e.g. this reference:
4) Choose one of the datasets available in the {archdata} package (e.g. ‘Handaxes’). You will have to analyse these data and can reuse the code you have already created (run ?archdata to see the list of datasets)
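R itself can also produce BibTeX entries that can be pasted into references.bib. As a small sketch, the base function citation() returns the reference for R, and toBibtex() formats it; a package name can be passed to citation() if that package is installed.

```r
# BibTeX entry for R itself (no extra package needed)
bib <- toBibtex(citation())
print(bib)

# The first line holds the entry type, e.g. @Manual
entry.type <- bib[1]
```

The citation key of such an entry may be empty, in which case it has to be added by hand before the entry can be cited with the @key syntax.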